Towards Automatic Capturing of Semi-structured Process Provenance
Abstract
Often, data processing is not implemented by a workflow system or an integration application but is performed manually by humans, following a more or less specified procedure. Collecting provenance information in such semi-structured processes cannot be automated. Moreover, manual collection of provenance information is error-prone and time-consuming. We therefore propose inferring provenance information from users' file read and write accesses. The derived provenance information is complete but has low precision. To improve the precision of the inferred provenance information, we further propose introducing organizational guidelines.
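The idea of inferring provenance from file accesses can be illustrated with a minimal sketch. It is not the authors' algorithm: the event format (`AccessEvent`), the function name `infer_provenance`, and the rule "every file a user read earlier is a candidate source of every file that user writes later" are assumptions made for illustration only. The rule is deliberately conservative, which is one way to obtain complete but imprecise results.

```python
from collections import defaultdict
from typing import NamedTuple


class AccessEvent(NamedTuple):
    """One logged file access (hypothetical log format)."""
    time: int   # logical timestamp
    user: str
    op: str     # "read" or "write"
    path: str


def infer_provenance(events):
    """Map each written file to the set of files it may have been derived from."""
    read_so_far = defaultdict(set)   # user -> files read up to now
    derived_from = defaultdict(set)  # written file -> candidate source files
    for ev in sorted(events, key=lambda e: e.time):
        if ev.op == "read":
            read_so_far[ev.user].add(ev.path)
        elif ev.op == "write":
            # Conservative rule (assumption): every earlier read by this
            # user is a candidate source of the written file.
            derived_from[ev.path] |= read_so_far[ev.user] - {ev.path}
    return dict(derived_from)


events = [
    AccessEvent(1, "alice", "read", "raw.csv"),
    AccessEvent(2, "alice", "read", "notes.txt"),
    AccessEvent(3, "alice", "write", "clean.csv"),
    AccessEvent(4, "bob", "read", "clean.csv"),
    AccessEvent(5, "bob", "write", "report.doc"),
]
provenance = infer_provenance(events)
```

In this toy log, `clean.csv` is linked to both `raw.csv` and `notes.txt`, even if `notes.txt` was opened for an unrelated reason. No true dependency is missed, but spurious ones appear, which is the kind of imprecision that additional constraints on how users work could reduce.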
Similar resources
Using Domain Ontologies to Help Track Data Provenance
Traditional techniques for tracking data provenance have difficulty adapting to the dynamics of the Web. This paper proposes a scheme for provenance estimation, based on domain ontologies. This scheme is part of the POESIA approach for multi-step integration of semi-structured data. The ontologies used for tracking provenance also help to describe, discover, reuse and integrate data and service...
The Aspect-Oriented Architecture of the CAPS Framework for Capturing, Analyzing and Archiving Provenance Data
With aspect-oriented programming techniques, modularity may be achieved by separating cross-cutting concerns. Data provenance can be considered a cross-cutting concern: code for collecting provenance data is usually scattered across various places in a software system. Aspect-oriented programming allows cross-cutting concerns to be integrated seamlessly into existing software applications withou...
Towards Automatic Capturing of Manual Data Processing Provenance
Often, data processing is not implemented by a workflow system or an integration application but is performed manually by humans, following a more or less specified procedure. Collecting provenance information during manual data processing cannot be automated. Moreover, manual collection of provenance information is error-prone and time-consuming. Therefore, we propose to infer provenanc...
Towards Automated Collection of Application-Level Data Provenance
Gathering data provenance at the operating system level is useful for capturing system-wide activity. However, many modern programs are complex and can perform numerous tasks concurrently. Capturing their provenance at this level, where processes are treated as single entities, may lead to the loss of useful intra-process detail. This can, in turn, produce false dependencies in the provenance g...
Detailed Provenance Capture of Data Processing
A large part of Linked Data generation entails processing the raw data. However, this process is only documented in human-readable form or as a software repository. This inhibits reproducibility and comparability, as current documentation solutions do not provide detailed metadata and rely on the availability of specific software environments. This paper proposes an automatic capturing mechanis...
Publication date: 2012